Overview

Dataset statistics

Number of variables: 16
Number of observations: 426
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 166.3 KiB
Average record size in memory: 399.7 B
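
The average record size follows from the total memory size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not taken from the report):

```python
# Figures copied from the "Dataset statistics" table.
total_kib = 166.3   # total size in memory, in KiB
n_rows = 426        # number of observations

# Average record size = total bytes / number of rows.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 399.7 B shown in the table.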

Variable types

Numeric: 11
Categorical: 5

Alerts

Model has a high cardinality: 423 distinct values (High cardinality)
MSRP is highly correlated with Invoice and 6 other fields (High correlation)
Invoice is highly correlated with MSRP and 6 other fields (High correlation)
EngineSize is highly correlated with MSRP and 8 other fields (High correlation)
Cylinders is highly correlated with MSRP and 8 other fields (High correlation)
Horsepower is highly correlated with MSRP and 7 other fields (High correlation)
MPG_City is highly correlated with MSRP and 8 other fields (High correlation)
MPG_Highway is highly correlated with MSRP and 7 other fields (High correlation)
Weight is highly correlated with MSRP and 8 other fields (High correlation)
Wheelbase is highly correlated with EngineSize and 6 other fields (High correlation)
Length is highly correlated with EngineSize and 4 other fields (High correlation)
MSRP is highly correlated with Invoice and 3 other fields (High correlation)
Invoice is highly correlated with MSRP and 3 other fields (High correlation)
EngineSize is highly correlated with MSRP and 8 other fields (High correlation)
Cylinders is highly correlated with MSRP and 8 other fields (High correlation)
Horsepower is highly correlated with MSRP and 6 other fields (High correlation)
MPG_City is highly correlated with EngineSize and 6 other fields (High correlation)
MPG_Highway is highly correlated with EngineSize and 5 other fields (High correlation)
Weight is highly correlated with EngineSize and 6 other fields (High correlation)
Wheelbase is highly correlated with EngineSize and 5 other fields (High correlation)
Length is highly correlated with EngineSize and 4 other fields (High correlation)
MSRP is highly correlated with Invoice and 4 other fields (High correlation)
Invoice is highly correlated with MSRP and 3 other fields (High correlation)
EngineSize is highly correlated with Cylinders and 5 other fields (High correlation)
Cylinders is highly correlated with MSRP and 7 other fields (High correlation)
Horsepower is highly correlated with MSRP and 6 other fields (High correlation)
MPG_City is highly correlated with MSRP and 6 other fields (High correlation)
MPG_Highway is highly correlated with EngineSize and 4 other fields (High correlation)
Weight is highly correlated with MSRP and 7 other fields (High correlation)
Wheelbase is highly correlated with EngineSize and 3 other fields (High correlation)
Length is highly correlated with Weight and 1 other field (High correlation)
Origin is highly correlated with Make (High correlation)
Make is highly correlated with Origin and 1 other field (High correlation)
DriveTrain is highly correlated with Make (High correlation)
df_index is highly correlated with Make and 1 other field (High correlation)
Make is highly correlated with df_index and 12 other fields (High correlation)
Type is highly correlated with DriveTrain and 3 other fields (High correlation)
Origin is highly correlated with df_index and 3 other fields (High correlation)
DriveTrain is highly correlated with Make and 6 other fields (High correlation)
MSRP is highly correlated with Make and 4 other fields (High correlation)
Invoice is highly correlated with Make and 4 other fields (High correlation)
EngineSize is highly correlated with Make and 11 other fields (High correlation)
Cylinders is highly correlated with Make and 8 other fields (High correlation)
Horsepower is highly correlated with Make and 10 other fields (High correlation)
MPG_City is highly correlated with Make and 8 other fields (High correlation)
MPG_Highway is highly correlated with Make and 7 other fields (High correlation)
Weight is highly correlated with Make and 7 other fields (High correlation)
Wheelbase is highly correlated with Make and 7 other fields (High correlation)
Length is highly correlated with Make and 4 other fields (High correlation)
df_index is uniformly distributed (Uniform)
Model is uniformly distributed (Uniform)
df_index has unique values (Unique)

Reproduction

Analysis started: 2021-12-02 16:40:05.456062
Analysis finished: 2021-12-02 16:40:24.097140
Duration: 18.64 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json
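
The reported duration can be cross-checked from the two timestamps. A small sketch using only the Python standard library (the format string is an assumption about how the timestamps are laid out, based on how they appear here):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the report's timestamps
started = datetime.strptime("2021-12-02 16:40:05.456062", FMT)
finished = datetime.strptime("2021-12-02 16:40:24.097140", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

Rounded to two decimals this agrees with the 18.64 seconds listed under Duration.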

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 426
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 213.3403756
Minimum: 0
Maximum: 427
Zeros: 1
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 21.25
Q1: 106.25
median: 212.5
Q3: 320.75
95-th percentile: 405.75
Maximum: 427
Range: 427
Interquartile range (IQR): 214.5

Descriptive statistics

Standard deviation: 123.9658743
Coefficient of variation (CV): 0.5810708543
Kurtosis: -1.207238095
Mean: 213.3403756
Median Absolute Deviation (MAD): 107.5
Skewness: 0.003783557101
Sum: 90883
Variance: 15367.53799
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=50)]
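
Several of the df_index figures are simple functions of others in the same block. The sketch below recomputes the range, IQR, coefficient of variation and variance from the reported values; it illustrates the relationships only and is not the profiler's own code.

```python
import math

# Values copied from the df_index statistics.
minimum, maximum = 0, 427
q1, q3 = 106.25, 320.75
mean = 213.3403756
std = 123.9658743

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation

# Compare against the figures printed in the report.
print(value_range == 427, iqr == 214.5)
print(math.isclose(cv, 0.5810708543, rel_tol=1e-6))
print(math.isclose(variance, 15367.53799, rel_tol=1e-6))
```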
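Common values

Value | Count | Frequency (%)
427 | 1 | 0.2%
133 | 1 | 0.2%
135 | 1 | 0.2%
136 | 1 | 0.2%
137 | 1 | 0.2%
138 | 1 | 0.2%
139 | 1 | 0.2%
140 | 1 | 0.2%
141 | 1 | 0.2%
142 | 1 | 0.2%
Other values (416) | 416 | 97.7%

Smallest values

Value | Count | Frequency (%)
0 | 1 | 0.2%
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%

Largest values

Value | Count | Frequency (%)
427 | 1 | 0.2%
426 | 1 | 0.2%
425 | 1 | 0.2%
424 | 1 | 0.2%
423 | 1 | 0.2%
422 | 1 | 0.2%
421 | 1 | 0.2%
420 | 1 | 0.2%
419 | 1 | 0.2%
418 | 1 | 0.2%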

Make
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 38
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB

Toyota: 28
Chevrolet: 27
Mercedes-Benz: 26
Ford: 23
BMW: 20
Other values (33): 302

Length

Max length: 13
Median length: 6
Mean length: 6.474178404
Min length: 3

Characters and Unicode

Total characters: 2758
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: Acura
2nd row: Acura
3rd row: Acura
4th row: Acura
5th row: Acura

Common Values

Value | Count | Frequency (%)
Toyota | 28 | 6.6%
Chevrolet | 27 | 6.3%
Mercedes-Benz | 26 | 6.1%
Ford | 23 | 5.4%
BMW | 20 | 4.7%
Audi | 19 | 4.5%
Honda | 17 | 4.0%
Nissan | 17 | 4.0%
Volkswagen | 15 | 3.5%
Chrysler | 15 | 3.5%
Other values (28) | 219 | 51.4%

Length

[Figure: histogram of lengths of the category]

Words

Value | Count | Frequency (%)
toyota | 28 | 6.5%
chevrolet | 27 | 6.3%
mercedes-benz | 26 | 6.1%
ford | 23 | 5.4%
bmw | 20 | 4.7%
audi | 19 | 4.4%
honda | 17 | 4.0%
nissan | 17 | 4.0%
chrysler | 15 | 3.5%
volkswagen | 15 | 3.5%
Other values (29) | 222 | 51.7%

Most occurring characters

Value | Count | Frequency (%)
e | 241 | 8.7%
a | 212 | 7.7%
o | 210 | 7.6%
r | 173 | 6.3%
i | 172 | 6.2%
n | 145 | 5.3%
u | 143 | 5.2%
s | 139 | 5.0%
d | 133 | 4.8%
l | 100 | 3.6%
Other values (36) | 1090 | 39.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2212 | 80.2%
Uppercase Letter | 517 | 18.7%
Dash Punctuation | 26 | 0.9%
Space Separator | 3 | 0.1%
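
The category names in this table correspond to Unicode general categories. As a rough illustration of how such counts can be obtained, the sketch below uses Python's `unicodedata` module on three Make values that appear in the report; the `LABELS` mapping and the `category_counts` helper are hypothetical names, and the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Map a few Unicode general-category codes to the labels used in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[LABELS.get(code, code)] += 1
    return counts

# Three Make values taken from the Common Values table.
counts = category_counts(["Mercedes-Benz", "Toyota", "BMW"])
print(counts)
```

Running the same helper over the full Make column would be expected to reproduce the four categories listed in the table.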

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 241 | 10.9%
a | 212 | 9.6%
o | 210 | 9.5%
r | 173 | 7.8%
i | 172 | 7.8%
n | 145 | 6.6%
u | 143 | 6.5%
s | 139 | 6.3%
d | 133 | 6.0%
l | 100 | 4.5%
Other values (14) | 544 | 24.6%

Uppercase Letter
Value | Count | Frequency (%)
M | 87 | 16.8%
C | 58 | 11.2%
B | 55 | 10.6%
S | 36 | 7.0%
H | 30 | 5.8%
T | 28 | 5.4%
V | 27 | 5.2%
A | 26 | 5.0%
F | 23 | 4.4%
L | 23 | 4.4%
Other values (10) | 124 | 24.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 26 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2729 | 98.9%
Common | 29 | 1.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 241 | 8.8%
a | 212 | 7.8%
o | 210 | 7.7%
r | 173 | 6.3%
i | 172 | 6.3%
n | 145 | 5.3%
u | 143 | 5.2%
s | 139 | 5.1%
d | 133 | 4.9%
l | 100 | 3.7%
Other values (34) | 1061 | 38.9%

Common
Value | Count | Frequency (%)
- | 26 | 89.7%
(space) | 3 | 10.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2758 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 241 | 8.7%
a | 212 | 7.7%
o | 210 | 7.6%
r | 173 | 6.3%
i | 172 | 6.2%
n | 145 | 5.3%
u | 143 | 5.2%
s | 139 | 5.0%
d | 133 | 4.8%
l | 100 | 3.6%
Other values (36) | 1090 | 39.5%

Model
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 423
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

C240 4dr: 2
G35 4dr: 2
C320 4dr: 2
Maxima SE 4dr: 1
Silverado 1500 Regular Cab: 1
Other values (418): 418

Length

Max length: 39
Median length: 13
Mean length: 14.50469484
Min length: 2

Characters and Unicode

Total characters: 6179
Distinct characters: 68
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 420
Unique (%): 98.6%
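
The Distinct and Unique figures for Model can be reconciled with the row count. The sketch below does that bookkeeping, assuming "Unique" means values that occur exactly once (which is consistent with the numbers shown, but is an interpretation rather than something the report states):

```python
n_rows = 426     # number of observations
distinct = 423   # distinct Model values
unique = 420     # values assumed to occur exactly once

repeated_values = distinct - unique      # values that occur more than once
rows_in_repeated = n_rows - unique       # rows covered by those repeated values

distinct_pct = round(100 * distinct / n_rows, 1)
unique_pct = round(100 * unique / n_rows, 1)
print(repeated_values, rows_in_repeated, distinct_pct, unique_pct)
```

Three repeated values covering six rows matches the three models listed with a count of 2 (C240 4dr, G35 4dr, C320 4dr).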

Sample

1st row: MDX
2nd row: RSX Type S 2dr
3rd row: TSX 4dr
4th row: TL 4dr
5th row: 3.5 RL 4dr

Common Values

Value | Count | Frequency (%)
C240 4dr | 2 | 0.5%
G35 4dr | 2 | 0.5%
C320 4dr | 2 | 0.5%
Maxima SE 4dr | 1 | 0.2%
Silverado 1500 Regular Cab | 1 | 0.2%
Echo 4dr | 1 | 0.2%
330i 4dr | 1 | 0.2%
9-3 Aero 4dr | 1 | 0.2%
Endeavor XLS | 1 | 0.2%
Civic Si 2dr hatch | 1 | 0.2%
Other values (413) | 413 | 96.9%

Length

[Figure: histogram of lengths of the category]

Words

Value | Count | Frequency (%)
4dr | 195 | 15.6%
2dr | 94 | 7.5%
convertible | 41 | 3.3%
lx | 22 | 1.8%
ls | 21 | 1.7%
v6 | 19 | 1.5%
se | 17 | 1.4%
cab | 15 | 1.2%
coupe | 15 | 1.2%
s | 14 | 1.1%
Other values (419) | 796 | 63.7%
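
The table above counts lowercased words rather than whole Model strings. A minimal sketch of that kind of count, assuming words are split on whitespace after lowercasing (the profiler's actual tokenization rules are not shown in the report, and `word_counts` is a hypothetical helper):

```python
from collections import Counter

def word_counts(values):
    """Count lowercased, whitespace-separated words across all strings."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# The five sample rows listed for Model.
sample = ["MDX", "RSX Type S 2dr", "TSX 4dr", "TL 4dr", "3.5 RL 4dr"]
counts = word_counts(sample)
print(counts.most_common(3))
```

Even in this five-row sample, "4dr" is the most frequent word, in line with its position at the top of the table.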

Most occurring characters

Value | Count | Frequency (%)
(space) | 823 | 13.3%
r | 575 | 9.3%
d | 355 | 5.7%
e | 326 | 5.3%
a | 323 | 5.2%
4 | 235 | 3.8%
o | 227 | 3.7%
t | 225 | 3.6%
S | 209 | 3.4%
i | 204 | 3.3%
Other values (58) | 2677 | 43.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3376 | 54.6%
Uppercase Letter | 1139 | 18.4%
Space Separator | 823 | 13.3%
Decimal Number | 752 | 12.2%
Other Punctuation | 44 | 0.7%
Dash Punctuation | 23 | 0.4%
Open Punctuation | 11 | 0.2%
Close Punctuation | 11 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 575 | 17.0%
d | 355 | 10.5%
e | 326 | 9.7%
a | 323 | 9.6%
o | 227 | 6.7%
t | 225 | 6.7%
i | 204 | 6.0%
n | 193 | 5.7%
c | 145 | 4.3%
l | 143 | 4.2%
Other values (16) | 660 | 19.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 209 | 18.3%
L | 136 | 11.9%
C | 100 | 8.8%
X | 94 | 8.3%
T | 91 | 8.0%
E | 73 | 6.4%
G | 63 | 5.5%
A | 58 | 5.1%
R | 48 | 4.2%
M | 43 | 3.8%
Other values (16) | 224 | 19.7%

Decimal Number
Value | Count | Frequency (%)
4 | 235 | 31.2%
2 | 141 | 18.8%
0 | 117 | 15.6%
3 | 79 | 10.5%
5 | 65 | 8.6%
6 | 38 | 5.1%
1 | 31 | 4.1%
8 | 24 | 3.2%
9 | 13 | 1.7%
7 | 9 | 1.2%

Other Punctuation
Value | Count | Frequency (%)
. | 39 | 88.6%
/ | 5 | 11.4%

Space Separator
Value | Count | Frequency (%)
(space) | 823 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 23 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 11 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 11 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4515 | 73.1%
Common | 1664 | 26.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 575 | 12.7%
d | 355 | 7.9%
e | 326 | 7.2%
a | 323 | 7.2%
o | 227 | 5.0%
t | 225 | 5.0%
S | 209 | 4.6%
i | 204 | 4.5%
n | 193 | 4.3%
c | 145 | 3.2%
Other values (42) | 1733 | 38.4%

Common
Value | Count | Frequency (%)
(space) | 823 | 49.5%
4 | 235 | 14.1%
2 | 141 | 8.5%
0 | 117 | 7.0%
3 | 79 | 4.7%
5 | 65 | 3.9%
. | 39 | 2.3%
6 | 38 | 2.3%
1 | 31 | 1.9%
8 | 24 | 1.4%
Other values (6) | 72 | 4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6179 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 823 | 13.3%
r | 575 | 9.3%
d | 355 | 5.7%
e | 326 | 5.3%
a | 323 | 5.2%
4 | 235 | 3.8%
o | 227 | 3.7%
t | 225 | 3.6%
S | 209 | 3.4%
i | 204 | 3.3%
Other values (58) | 2677 | 43.3%

Type
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 25.8 KiB

Sedan: 262
SUV: 60
Sports: 47
Wagon: 30
Truck: 24

Length

Max length: 6
Median length: 5
Mean length: 4.835680751
Min length: 3

Characters and Unicode

Total characters: 2060
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SUV
2nd row: Sedan
3rd row: Sedan
4th row: Sedan
5th row: Sedan

Common Values

Value | Count | Frequency (%)
Sedan | 262 | 61.5%
SUV | 60 | 14.1%
Sports | 47 | 11.0%
Wagon | 30 | 7.0%
Truck | 24 | 5.6%
Hybrid | 3 | 0.7%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the categories]

Words

Value | Count | Frequency (%)
sedan | 262 | 61.5%
suv | 60 | 14.1%
sports | 47 | 11.0%
wagon | 30 | 7.0%
truck | 24 | 5.6%
hybrid | 3 | 0.7%

Most occurring characters

Value | Count | Frequency (%)
S | 369 | 17.9%
a | 292 | 14.2%
n | 292 | 14.2%
d | 265 | 12.9%
e | 262 | 12.7%
o | 77 | 3.7%
r | 74 | 3.6%
U | 60 | 2.9%
V | 60 | 2.9%
p | 47 | 2.3%
Other values (12) | 262 | 12.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1514 | 73.5%
Uppercase Letter | 546 | 26.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 292 | 19.3%
n | 292 | 19.3%
d | 265 | 17.5%
e | 262 | 17.3%
o | 77 | 5.1%
r | 74 | 4.9%
p | 47 | 3.1%
t | 47 | 3.1%
s | 47 | 3.1%
g | 30 | 2.0%
Other values (6) | 81 | 5.4%

Uppercase Letter
Value | Count | Frequency (%)
S | 369 | 67.6%
U | 60 | 11.0%
V | 60 | 11.0%
W | 30 | 5.5%
T | 24 | 4.4%
H | 3 | 0.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2060 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 369 | 17.9%
a | 292 | 14.2%
n | 292 | 14.2%
d | 265 | 12.9%
e | 262 | 12.7%
o | 77 | 3.7%
r | 74 | 3.6%
U | 60 | 2.9%
V | 60 | 2.9%
p | 47 | 2.3%
Other values (12) | 262 | 12.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2060 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 369 | 17.9%
a | 292 | 14.2%
n | 292 | 14.2%
d | 265 | 12.9%
e | 262 | 12.7%
o | 77 | 3.7%
r | 74 | 3.6%
U | 60 | 2.9%
V | 60 | 2.9%
p | 47 | 2.3%
Other values (12) | 262 | 12.7%

Origin
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 KiB

Asia: 156
USA: 147
Europe: 123

Length

Max length: 6
Median length: 4
Mean length: 4.232394366
Min length: 3

Characters and Unicode

Total characters: 1803
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Asia
2nd row: Asia
3rd row: Asia
4th row: Asia
5th row: Asia

Common Values

Value | Count | Frequency (%)
Asia | 156 | 36.6%
USA | 147 | 34.5%
Europe | 123 | 28.9%
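
Because Origin has only three values, its frequency and length statistics can be recomputed directly from the counts above. The following sketch does so with plain Python; the variable names are illustrative.

```python
# Value counts copied from the Common Values table for Origin.
value_counts = {"Asia": 156, "USA": 147, "Europe": 123}

n = sum(value_counts.values())  # number of observations

# Frequency (%) of each value, rounded to one decimal as in the report.
shares = {value: round(100 * count / n, 1) for value, count in value_counts.items()}

# Total characters and mean string length, weighted by the counts.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n

print(n, shares, total_chars, round(mean_length, 9))
```

The totals agree with the 1803 total characters and the mean length of 4.232394366 reported for this variable.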

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the categories]

Words

Value | Count | Frequency (%)
asia | 156 | 36.6%
usa | 147 | 34.5%
europe | 123 | 28.9%

Most occurring characters

Value | Count | Frequency (%)
A | 303 | 16.8%
a | 156 | 8.7%
i | 156 | 8.7%
s | 156 | 8.7%
S | 147 | 8.2%
U | 147 | 8.2%
e | 123 | 6.8%
p | 123 | 6.8%
o | 123 | 6.8%
r | 123 | 6.8%
Other values (2) | 246 | 13.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1083 | 60.1%
Uppercase Letter | 720 | 39.9%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 156 | 14.4%
i | 156 | 14.4%
s | 156 | 14.4%
e | 123 | 11.4%
p | 123 | 11.4%
o | 123 | 11.4%
r | 123 | 11.4%
u | 123 | 11.4%

Uppercase Letter
Value | Count | Frequency (%)
A | 303 | 42.1%
S | 147 | 20.4%
U | 147 | 20.4%
E | 123 | 17.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1803 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 303 | 16.8%
a | 156 | 8.7%
i | 156 | 8.7%
s | 156 | 8.7%
S | 147 | 8.2%
U | 147 | 8.2%
e | 123 | 6.8%
p | 123 | 6.8%
o | 123 | 6.8%
r | 123 | 6.8%
Other values (2) | 246 | 13.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1803 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 303 | 16.8%
a | 156 | 8.7%
i | 156 | 8.7%
s | 156 | 8.7%
S | 147 | 8.2%
U | 147 | 8.2%
e | 123 | 6.8%
p | 123 | 6.8%
o | 123 | 6.8%
r | 123 | 6.8%
Other values (2) | 246 | 13.6%

DriveTrain
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 KiB

Front: 226
Rear: 108
All: 92

Length

Max length: 5
Median length: 5
Mean length: 4.314553991
Min length: 3

Characters and Unicode

Total characters: 1838
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: Front
3rd row: Front
4th row: Front
5th row: Front

Common Values

Value | Count | Frequency (%)
Front | 226 | 53.1%
Rear | 108 | 25.4%
All | 92 | 21.6%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the categories]

Words

Value | Count | Frequency (%)
front | 226 | 53.1%
rear | 108 | 25.4%
all | 92 | 21.6%

Most occurring characters

Value | Count | Frequency (%)
r | 334 | 18.2%
t | 226 | 12.3%
n | 226 | 12.3%
o | 226 | 12.3%
F | 226 | 12.3%
l | 184 | 10.0%
a | 108 | 5.9%
e | 108 | 5.9%
R | 108 | 5.9%
A | 92 | 5.0%
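
The character counts for DriveTrain follow from the three category counts: each character of a value contributes once per row holding that value. A short sketch of that tally, with illustrative names:

```python
from collections import Counter

# Value counts copied from the Common Values table for DriveTrain.
value_counts = {"Front": 226, "Rear": 108, "All": 92}

# Each character of a value is counted once for every row with that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

print(char_counts.most_common())
print(sum(char_counts.values()))  # total characters
```

For example, lowercase "r" occurs once in "Front" and once in "Rear", giving 226 + 108 = 334, and "l" occurs twice in "All", giving 184.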

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1412 | 76.8%
Uppercase Letter | 426 | 23.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 334 | 23.7%
t | 226 | 16.0%
n | 226 | 16.0%
o | 226 | 16.0%
l | 184 | 13.0%
a | 108 | 7.6%
e | 108 | 7.6%

Uppercase Letter
Value | Count | Frequency (%)
F | 226 | 53.1%
R | 108 | 25.4%
A | 92 | 21.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1838 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 334 | 18.2%
t | 226 | 12.3%
n | 226 | 12.3%
o | 226 | 12.3%
F | 226 | 12.3%
l | 184 | 10.0%
a | 108 | 5.9%
e | 108 | 5.9%
R | 108 | 5.9%
A | 92 | 5.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 334 | 18.2%
t | 226 | 12.3%
n | 226 | 12.3%
o | 226 | 12.3%
F | 226 | 12.3%
l | 184 | 10.0%
a | 108 | 5.9%
e | 108 | 5.9%
R | 108 | 5.9%
A | 92 | 5.0%

MSRP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 409
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32804.5493
Minimum: 10280
Maximum: 192465
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 10280
5-th percentile: 13685
Q1: 20324.75
median: 27807.5
Q3: 39225
95-th percentile: 72958.75
Maximum: 192465
Range: 182185
Interquartile range (IQR): 18900.25

Descriptive statistics

Standard deviation: 19472.46082
Coefficient of variation (CV): 0.5935902563
Kurtosis: 13.80083374
Mean: 32804.5493
Median Absolute Deviation (MAD): 8398.5
Skewness: 2.789290886
Sum: 13974738
Variance: 379176730.6
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
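
To make "fixed size bins" concrete, the sketch below derives equal-width bin edges for MSRP from its reported minimum, maximum and bin count. It assumes the bins span exactly the range from minimum to maximum, which is a common convention but is not confirmed by the report; `bin_edges` and `bin_index` are hypothetical helper names.

```python
def bin_edges(minimum, maximum, bins):
    """Return the bins + 1 edges of equal-width bins over [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]

def bin_index(value, minimum, maximum, bins):
    """Return the 0-based bin a value falls into; the maximum goes in the last bin."""
    if value == maximum:
        return bins - 1
    width = (maximum - minimum) / bins
    return int((value - minimum) / width)

# MSRP figures copied from the statistics above.
msrp_min, msrp_max, n_bins = 10280, 192465, 50
edges = bin_edges(msrp_min, msrp_max, n_bins)

print(edges[1] - edges[0])                               # width of one bin
print(bin_index(27807.5, msrp_min, msrp_max, n_bins))    # bin holding the median
```

Under this assumption each bin is about 3643.7 wide, so the median MSRP sits in one of the first few bins, consistent with the strong positive skew reported for this variable.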
Common values

Value | Count | Frequency (%)
33995 | 2 | 0.5%
29995 | 2 | 0.5%
19635 | 2 | 0.5%
49995 | 2 | 0.5%
19860 | 2 | 0.5%
21595 | 2 | 0.5%
28495 | 2 | 0.5%
35940 | 2 | 0.5%
23495 | 2 | 0.5%
34495 | 2 | 0.5%
Other values (399) | 406 | 95.3%

Smallest values

Value | Count | Frequency (%)
10280 | 1 | 0.2%
10539 | 1 | 0.2%
10760 | 1 | 0.2%
10995 | 1 | 0.2%
11155 | 1 | 0.2%
11290 | 1 | 0.2%
11560 | 1 | 0.2%
11690 | 1 | 0.2%
11839 | 1 | 0.2%
11905 | 1 | 0.2%

Largest values

Value | Count | Frequency (%)
192465 | 1 | 0.2%
128420 | 1 | 0.2%
126670 | 1 | 0.2%
121770 | 1 | 0.2%
94820 | 1 | 0.2%
90520 | 1 | 0.2%
89765 | 1 | 0.2%
86995 | 1 | 0.2%
86970 | 1 | 0.2%
84600 | 1 | 0.2%

Invoice
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 423
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30040.65493
Minimum: 9875
Maximum: 173560
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 9875
5-th percentile: 12834.75
Q1: 18836
median: 25521.5
Q3: 35754.75
95-th percentile: 66574.25
Maximum: 173560
Range: 163685
Interquartile range (IQR): 16918.75

Descriptive statistics

Standard deviation: 17679.43012
Coefficient of variation (CV): 0.5885168004
Kurtosis: 13.86675651
Mean: 30040.65493
Median Absolute Deviation (MAD): 7609
Skewness: 2.825874664
Sum: 12797319
Variance: 312562249.4
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
14207 | 2 | 0.5%
68306 | 2 | 0.5%
19638 | 2 | 0.5%
37886 | 1 | 0.2%
19810 | 1 | 0.2%
24909 | 1 | 0.2%
13650 | 1 | 0.2%
24915 | 1 | 0.2%
13393 | 1 | 0.2%
26966 | 1 | 0.2%
Other values (413) | 413 | 96.9%

Smallest values

Value | Count | Frequency (%)
9875 | 1 | 0.2%
10107 | 1 | 0.2%
10144 | 1 | 0.2%
10319 | 1 | 0.2%
10642 | 1 | 0.2%
10705 | 1 | 0.2%
10896 | 1 | 0.2%
10965 | 1 | 0.2%
11116 | 1 | 0.2%
11209 | 1 | 0.2%

Largest values

Value | Count | Frequency (%)
173560 | 1 | 0.2%
119600 | 1 | 0.2%
117854 | 1 | 0.2%
113388 | 1 | 0.2%
88324 | 1 | 0.2%
84325 | 1 | 0.2%
80939 | 1 | 0.2%
79978 | 1 | 0.2%
79226 | 1 | 0.2%
76417 | 1 | 0.2%

EngineSize
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 42
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.205633803
Minimum: 1.4
Maximum: 8.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 1.4
5-th percentile: 1.8
Q1: 2.4
median: 3
Q3: 3.9
95-th percentile: 5.3
Maximum: 8.3
Range: 6.9
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 1.103520014
Coefficient of variation (CV): 0.344243941
Kurtosis: 0.5596129117
Mean: 3.205633803
Median Absolute Deviation (MAD): 0.8
Skewness: 0.7210462455
Sum: 1365.6
Variance: 1.217756421
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=42)]

Common values

Value | Count | Frequency (%)
3 | 42 | 9.9%
3.5 | 34 | 8.0%
2 | 30 | 7.0%
2.5 | 26 | 6.1%
2.4 | 23 | 5.4%
1.8 | 23 | 5.4%
4.6 | 21 | 4.9%
4.2 | 20 | 4.7%
3.2 | 18 | 4.2%
3.8 | 17 | 4.0%
Other values (32) | 172 | 40.4%

Smallest values

Value | Count | Frequency (%)
1.4 | 1 | 0.2%
1.5 | 6 | 1.4%
1.6 | 10 | 2.3%
1.7 | 4 | 0.9%
1.8 | 23 | 5.4%
1.9 | 3 | 0.7%
2 | 30 | 7.0%
2.2 | 15 | 3.5%
2.3 | 13 | 3.1%
2.4 | 23 | 5.4%

Largest values

Value | Count | Frequency (%)
8.3 | 1 | 0.2%
6.8 | 1 | 0.2%
6 | 6 | 1.4%
5.7 | 3 | 0.7%
5.6 | 2 | 0.5%
5.5 | 3 | 0.7%
5.4 | 2 | 0.5%
5.3 | 5 | 1.2%
5 | 8 | 1.9%
4.8 | 2 | 0.5%

Cylinders
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 7
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.807511737
Minimum: 3
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4
Q1: 4
median: 6
Q3: 6
95-th percentile: 8
Maximum: 12
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.558442633
Coefficient of variation (CV): 0.2683494591
Kurtosis: 0.4403783249
Mean: 5.807511737
Median Absolute Deviation (MAD): 2
Skewness: 0.5927851991
Sum: 2474
Variance: 2.428743441
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=7)]

Common values

Value | Count | Frequency (%)
6 | 190 | 44.6%
4 | 136 | 31.9%
8 | 87 | 20.4%
5 | 7 | 1.6%
12 | 3 | 0.7%
10 | 2 | 0.5%
3 | 1 | 0.2%

Smallest values

Value | Count | Frequency (%)
3 | 1 | 0.2%
4 | 136 | 31.9%
5 | 7 | 1.6%
6 | 190 | 44.6%
8 | 87 | 20.4%
10 | 2 | 0.5%
12 | 3 | 0.7%

Largest values

Value | Count | Frequency (%)
12 | 3 | 0.7%
10 | 2 | 0.5%
8 | 87 | 20.4%
6 | 190 | 44.6%
5 | 7 | 1.6%
4 | 136 | 31.9%
3 | 1 | 0.2%
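
Cylinders has only seven distinct values, so the whole column can be rebuilt from its frequency table and the summary statistics rechecked. The sketch below does this with Python's `statistics` module, assuming the reported variance is the sample variance (n - 1 denominator) and the MAD is the median of absolute deviations from the median; both assumptions are consistent with the numbers shown.

```python
import statistics

# Value counts copied from the Cylinders frequency table.
value_counts = {6: 190, 4: 136, 8: 87, 5: 7, 12: 3, 10: 2, 3: 1}

# Rebuild the column: each value repeated as many times as it occurs.
column = [value for value, count in value_counts.items() for _ in range(count)]

n = len(column)
total = sum(column)
mean = total / n
median = statistics.median(column)
variance = statistics.variance(column)                       # sample variance
mad = statistics.median([abs(x - median) for x in column])   # median absolute deviation

print(n, total, round(mean, 9), median, round(variance, 9), mad)
```

The rebuilt column gives the same sum (2474), mean, median, variance and MAD as the report, which also confirms that the frequency table was read correctly.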

Horsepower
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 109
Distinct (%): 25.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 215.8779343
Minimum: 73
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 73
5-th percentile: 115
Q1: 165
median: 210
Q3: 255
95-th percentile: 338.75
Maximum: 500
Range: 427
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation: 71.99103952
Coefficient of variation (CV): 0.3334803057
Kurtosis: 1.534797286
Mean: 215.8779343
Median Absolute Deviation (MAD): 45
Skewness: 0.928996008
Sum: 91964
Variance: 5182.709771
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
200 | 17 | 4.0%
210 | 14 | 3.3%
215 | 14 | 3.3%
225 | 13 | 3.1%
240 | 13 | 3.1%
220 | 12 | 2.8%
140 | 12 | 2.8%
300 | 11 | 2.6%
170 | 11 | 2.6%
130 | 10 | 2.3%
Other values (99) | 299 | 70.2%

Smallest values

Value | Count | Frequency (%)
73 | 1 | 0.2%
93 | 1 | 0.2%
100 | 1 | 0.2%
103 | 5 | 1.2%
104 | 3 | 0.7%
108 | 5 | 1.2%
110 | 2 | 0.5%
115 | 6 | 1.4%
117 | 1 | 0.2%
119 | 2 | 0.5%

Largest values

Value | Count | Frequency (%)
500 | 1 | 0.2%
493 | 3 | 0.7%
477 | 1 | 0.2%
450 | 1 | 0.2%
420 | 1 | 0.2%
390 | 4 | 0.9%
350 | 2 | 0.5%
349 | 2 | 0.5%
345 | 1 | 0.2%
340 | 6 | 1.4%

MPG_City
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 28
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.07042254
Minimum: 10
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 10
5-th percentile: 14
Q1: 17
median: 19
Q3: 21.75
95-th percentile: 29
Maximum: 60
Range: 50
Interquartile range (IQR): 4.75

Descriptive statistics

Standard deviation: 5.248616025
Coefficient of variation (CV): 0.2615099914
Kurtosis: 15.71070077
Mean: 20.07042254
Median Absolute Deviation (MAD): 2
Skewness: 2.773375104
Sum: 8550
Variance: 27.54797017
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=28)]

Common values

Value | Count | Frequency (%)
18 | 67 | 15.7%
20 | 57 | 13.4%
17 | 41 | 9.6%
21 | 38 | 8.9%
19 | 37 | 8.7%
16 | 31 | 7.3%
26 | 22 | 5.2%
24 | 22 | 5.2%
22 | 18 | 4.2%
15 | 17 | 4.0%
Other values (18) | 76 | 17.8%

Smallest values

Value | Count | Frequency (%)
10 | 2 | 0.5%
12 | 4 | 0.9%
13 | 12 | 2.8%
14 | 13 | 3.1%
15 | 17 | 4.0%
16 | 31 | 7.3%
17 | 41 | 9.6%
18 | 67 | 15.7%
19 | 37 | 8.7%
20 | 57 | 13.4%

Largest values

Value | Count | Frequency (%)
60 | 1 | 0.2%
59 | 1 | 0.2%
46 | 1 | 0.2%
38 | 1 | 0.2%
36 | 1 | 0.2%
35 | 2 | 0.5%
33 | 1 | 0.2%
32 | 7 | 1.6%
31 | 1 | 0.2%
29 | 7 | 1.6%

MPG_Highway
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 33
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.85446009
Minimum: 12
Maximum: 66
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 12
5-th percentile: 18
Q1: 24
median: 26
Q3: 29
95-th percentile: 36
Maximum: 66
Range: 54
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.752334876
Coefficient of variation (CV): 0.2142040784
Kurtosis: 6.008641432
Mean: 26.85446009
Median Absolute Deviation (MAD): 3
Skewness: 1.245622363
Sum: 11440
Variance: 33.08935653
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=33)]

Common values

Value | Count | Frequency (%)
26 | 54 | 12.7%
25 | 43 | 10.1%
28 | 38 | 8.9%
29 | 34 | 8.0%
27 | 28 | 6.6%
30 | 24 | 5.6%
24 | 24 | 5.6%
19 | 16 | 3.8%
21 | 16 | 3.8%
23 | 16 | 3.8%
Other values (23) | 133 | 31.2%

Smallest values

Value | Count | Frequency (%)
12 | 1 | 0.2%
13 | 1 | 0.2%
14 | 1 | 0.2%
16 | 2 | 0.5%
17 | 9 | 2.1%
18 | 11 | 2.6%
19 | 16 | 3.8%
20 | 13 | 3.1%
21 | 16 | 3.8%
22 | 13 | 3.1%

Largest values

Value | Count | Frequency (%)
66 | 1 | 0.2%
51 | 2 | 0.5%
46 | 1 | 0.2%
44 | 1 | 0.2%
43 | 2 | 0.5%
40 | 3 | 0.7%
39 | 1 | 0.2%
38 | 3 | 0.7%
37 | 5 | 1.2%
36 | 5 | 1.2%

Weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 346
Distinct (%): 81.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3580.474178
Minimum: 1850
Maximum: 7190
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 1850
5-th percentile: 2513
Q1: 3111.25
median: 3476
Q3: 3979.25
95-th percentile: 4996.75
Maximum: 7190
Range: 5340
Interquartile range (IQR): 868

Descriptive statistics

Standard deviation: 759.8700728
Coefficient of variation (CV): 0.2122261005
Kurtosis: 1.676000966
Mean: 3580.474178
Median Absolute Deviation (MAD): 428
Skewness: 0.8845782783
Sum: 1525282
Variance: 577402.5276
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
3175 | 4 | 0.9%
3285 | 4 | 0.9%
3450 | 4 | 0.9%
3428 | 3 | 0.7%
3197 | 3 | 0.7%
4057 | 3 | 0.7%
3803 | 3 | 0.7%
2524 | 3 | 0.7%
2676 | 3 | 0.7%
3470 | 3 | 0.7%
Other values (336) | 393 | 92.3%

Smallest values

Value | Count | Frequency (%)
1850 | 1 | 0.2%
2035 | 1 | 0.2%
2055 | 1 | 0.2%
2085 | 1 | 0.2%
2195 | 1 | 0.2%
2255 | 1 | 0.2%
2290 | 1 | 0.2%
2339 | 1 | 0.2%
2340 | 1 | 0.2%
2348 | 1 | 0.2%

Largest values

Value | Count | Frequency (%)
7190 | 1 | 0.2%
6400 | 1 | 0.2%
6133 | 1 | 0.2%
5969 | 1 | 0.2%
5879 | 1 | 0.2%
5678 | 1 | 0.2%
5590 | 1 | 0.2%
5464 | 1 | 0.2%
5440 | 1 | 0.2%
5423 | 1 | 0.2%

Wheelbase
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 40
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108.1643192
Minimum: 89
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 89
5-th percentile: 95.25
Q1: 103
median: 107
Q3: 112
95-th percentile: 123
Maximum: 144
Range: 55
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 8.330030387
Coefficient of variation (CV): 0.07701273807
Kurtosis: 2.108267519
Mean: 108.1643192
Median Absolute Deviation (MAD): 5
Skewness: 0.9569295076
Sum: 46078
Variance: 69.38940624
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=40)]

Common values

Value | Count | Frequency (%)
107 | 45 | 10.6%
103 | 30 | 7.0%
112 | 25 | 5.9%
106 | 25 | 5.9%
104 | 24 | 5.6%
105 | 21 | 4.9%
115 | 20 | 4.7%
109 | 17 | 4.0%
111 | 17 | 4.0%
102 | 16 | 3.8%
Other values (30) | 186 | 43.7%

Smallest values

Value | Count | Frequency (%)
89 | 2 | 0.5%
93 | 9 | 2.1%
95 | 11 | 2.6%
96 | 5 | 1.2%
97 | 3 | 0.7%
98 | 11 | 2.6%
99 | 11 | 2.6%
100 | 7 | 1.6%
101 | 16 | 3.8%
102 | 16 | 3.8%

Largest values

Value | Count | Frequency (%)
144 | 2 | 0.5%
140 | 1 | 0.2%
137 | 1 | 0.2%
133 | 2 | 0.5%
131 | 1 | 0.2%
130 | 4 | 0.9%
129 | 2 | 0.5%
128 | 2 | 0.5%
126 | 2 | 0.5%
124 | 3 | 0.7%

Length
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 67
Distinct (%): 15.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186.4201878
Minimum: 143
Maximum: 238
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 143
5-th percentile: 163
Q1: 178
median: 187
Q3: 194
95-th percentile: 212
Maximum: 238
Range: 95
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 14.3666105
Coefficient of variation (CV): 0.07706574419
Kurtosis: 0.6176732951
Mean: 186.4201878
Median Absolute Deviation (MAD): 9
Skewness: 0.1733442416
Sum: 79415
Variance: 206.3994974
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
178 | 27 | 6.3%
190 | 22 | 5.2%
187 | 17 | 4.0%
192 | 16 | 3.8%
188 | 15 | 3.5%
191 | 14 | 3.3%
179 | 14 | 3.3%
200 | 13 | 3.1%
177 | 13 | 3.1%
183 | 12 | 2.8%
Other values (57) | 263 | 61.7%

Smallest values

Value | Count | Frequency (%)
143 | 1 | 0.2%
144 | 1 | 0.2%
150 | 1 | 0.2%
153 | 2 | 0.5%
154 | 1 | 0.2%
155 | 2 | 0.5%
156 | 2 | 0.5%
158 | 2 | 0.5%
159 | 3 | 0.7%
160 | 1 | 0.2%

Largest values

Value | Count | Frequency (%)
238 | 1 | 0.2%
230 | 1 | 0.2%
227 | 1 | 0.2%
224 | 1 | 0.2%
222 | 2 | 0.5%
221 | 2 | 0.5%
219 | 3 | 0.7%
218 | 3 | 0.7%
215 | 2 | 0.5%
212 | 7 | 1.6%

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
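The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is a minimal illustration rather than the report generator's own implementation. The helper names (`average_ranks`, `_pearson`, `spearman_rho`) and the toy data are assumptions, and tied values are given the average of their rank positions, which is one common convention.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def _pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: the Pearson formula applied to the rank variables."""
    return _pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = _pearson(x, y)
print(rho, r)
```

On this toy data ρ reaches 1 because the relationship is perfectly monotonic, while the same formula applied to the raw values stays below 1 because the relationship is not linear.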

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
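As a rough sketch of this formula, the function below computes r directly from the covariance and the two standard deviations, and then repeats the calculation after shifting and rescaling both variables to illustrate the invariance mentioned above. The function name `pearson_r` is illustrative; the data are the Weight and Length values of the first five rows shown under "First rows", and the positive rescaling factors are arbitrary.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Weight and Length of the first five sample rows.
weight = [4451, 2778, 3230, 3575, 3880]
length = [189, 172, 183, 186, 197]

r = pearson_r(weight, length)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([2 * w + 3 for w in weight], [0.5 * v - 10 for v in length])

print(r, r_rescaled)
```

Both calls return the same value up to floating-point rounding, since the location and scale changes cancel in the ratio.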

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
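The pair-counting definition above can be sketched as follows. This computes the simple form of the coefficient exactly as described (often called τ-a) and counts tied pairs as neither concordant nor discordant; statistical libraries often report a tie-adjusted variant instead, so results may differ when ties are present. The function name `kendall_tau_a` is illustrative, and the data are the MPG_City and Weight values of the first five rows shown under "First rows".

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 is a tie in x or y and counts as neither
    return (concordant - discordant) / len(pairs)

# MPG_City and Weight of the first five sample rows.
mpg_city = [17, 24, 22, 20, 18]
weight = [4451, 2778, 3230, 3575, 3880]

tau = kendall_tau_a(mpg_city, weight)
print(tau)  # -1.0: in these five rows every heavier car has lower city mileage
```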

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
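For illustration, the sketch below applies the bias correction as it is commonly stated: the chi-squared statistic divided by n gives φ², which is reduced by (k-1)(r-1)/(n-1) and floored at zero, and the row and column counts r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1) before taking the square root of the ratio. It is not taken from the report generator's code. The function name and the two contingency tables are hypothetical, and the sketch assumes every row and column total is nonzero.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)       # number of rows
    k = len(table[0])    # number of columns
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (assumes nonzero row and column totals).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical tables: one with independent margins, one perfectly associated.
independent = [[10, 20], [20, 40]]
associated = [[30, 0], [0, 30]]

v_independent = cramers_v_corrected(independent)
v_associated = cramers_v_corrected(associated)
print(v_independent, v_associated)
```

The independent table yields 0 and the perfectly associated table yields 1 (up to floating-point rounding), matching the two ends of the range described above.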

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

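(index) | df_index | Make | Model | Type | Origin | DriveTrain | MSRP | Invoice | EngineSize | Cylinders | Horsepower | MPG_City | MPG_Highway | Weight | Wheelbase | Length
0 | 0 | Acura | MDX | SUV | Asia | All | 36945 | 33337 | 3.5 | 6.0 | 265 | 17 | 23 | 4451 | 106 | 189
1 | 1 | Acura | RSX Type S 2dr | Sedan | Asia | Front | 23820 | 21761 | 2.0 | 4.0 | 200 | 24 | 31 | 2778 | 101 | 172
2 | 2 | Acura | TSX 4dr | Sedan | Asia | Front | 26990 | 24647 | 2.4 | 4.0 | 200 | 22 | 29 | 3230 | 105 | 183
3 | 3 | Acura | TL 4dr | Sedan | Asia | Front | 33195 | 30299 | 3.2 | 6.0 | 270 | 20 | 28 | 3575 | 108 | 186
4 | 4 | Acura | 3.5 RL 4dr | Sedan | Asia | Front | 43755 | 39014 | 3.5 | 6.0 | 225 | 18 | 24 | 3880 | 115 | 197
5 | 5 | Acura | 3.5 RL w/Navigation 4dr | Sedan | Asia | Front | 46100 | 41100 | 3.5 | 6.0 | 225 | 18 | 24 | 3893 | 115 | 197
6 | 6 | Acura | NSX coupe 2dr manual S | Sports | Asia | Rear | 89765 | 79978 | 3.2 | 6.0 | 290 | 17 | 24 | 3153 | 100 | 174
7 | 7 | Audi | A4 1.8T 4dr | Sedan | Europe | Front | 25940 | 23508 | 1.8 | 4.0 | 170 | 22 | 31 | 3252 | 104 | 179
8 | 8 | Audi | A41.8T convertible 2dr | Sedan | Europe | Front | 35940 | 32506 | 1.8 | 4.0 | 170 | 23 | 30 | 3638 | 105 | 180
9 | 9 | Audi | A4 3.0 4dr | Sedan | Europe | Front | 31840 | 28846 | 3.0 | 6.0 | 220 | 20 | 28 | 3462 | 104 | 179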

Last rows

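(index) | df_index | Make | Model | Type | Origin | DriveTrain | MSRP | Invoice | EngineSize | Cylinders | Horsepower | MPG_City | MPG_Highway | Weight | Wheelbase | Length
416 | 418 | Volvo | S60 2.5 4dr | Sedan | Europe | All | 31745 | 29916 | 2.5 | 5.0 | 208 | 20 | 27 | 3903 | 107 | 180
417 | 419 | Volvo | S60 T5 4dr | Sedan | Europe | Front | 34845 | 32902 | 2.3 | 5.0 | 247 | 20 | 28 | 3766 | 107 | 180
418 | 420 | Volvo | S60 R 4dr | Sedan | Europe | All | 37560 | 35382 | 2.5 | 5.0 | 300 | 18 | 25 | 3571 | 107 | 181
419 | 421 | Volvo | S80 2.9 4dr | Sedan | Europe | Front | 37730 | 35542 | 2.9 | 6.0 | 208 | 20 | 28 | 3576 | 110 | 190
420 | 422 | Volvo | S80 2.5T 4dr | Sedan | Europe | All | 37885 | 35688 | 2.5 | 5.0 | 194 | 20 | 27 | 3691 | 110 | 190
421 | 423 | Volvo | C70 LPT convertible 2dr | Sedan | Europe | Front | 40565 | 38203 | 2.4 | 5.0 | 197 | 21 | 28 | 3450 | 105 | 186
422 | 424 | Volvo | C70 HPT convertible 2dr | Sedan | Europe | Front | 42565 | 40083 | 2.3 | 5.0 | 242 | 20 | 26 | 3450 | 105 | 186
423 | 425 | Volvo | S80 T6 4dr | Sedan | Europe | Front | 45210 | 42573 | 2.9 | 6.0 | 268 | 19 | 26 | 3653 | 110 | 190
424 | 426 | Volvo | V40 | Wagon | Europe | Front | 26135 | 24641 | 1.9 | 4.0 | 170 | 22 | 29 | 2822 | 101 | 180
425 | 427 | Volvo | XC70 | Wagon | Europe | All | 35145 | 33112 | 2.5 | 5.0 | 208 | 20 | 27 | 3823 | 109 | 186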